*----------------------------------------------------------------------------------------------------------	* 
* Silencing the Rails: A Study of the Noise-Safety Trade-off in Railway Quiet Zones                         *
* RESEARCHERS:		Emtiaz Hritan																		    *
* PROGRAMMED BY:	Emtiaz Hritan 					 											            *
* CREATED:			Oct. 25, 2023																		   	*
* LAST MODIFIED:	Oct. 7, 2025														       				*
*----------------------------------------------------------------------------------------------------------	*

clear all
set more off
* Set local paths
* Set this local datapath equal to the folder location for data
	local datapath "C:\Users\name\Replication Files Silencing the Rails\Data"
* Set this local outputpath equal to the folder location for outcome like tables
	local outputpath "C:\Users\name\Replication Files Silencing the Rails\Outcome"

* Install necessary packages
ssc install reghdfe
ssc install ftools
ssc install grstyle, replace
ssc install palettes, replace
ssc install colrspace, replace
ssc install ppmlhdfe, replace 
ssc install jwdid, replace 
ssc install csdid, replace 
ssc install drdid, replace 
ssc install did_imputation, replace
ssc install frause, replace
frause mpdta.dta, clear
ssc install hdfe, replace 
ssc install event_plot, replace
ssc install addplot, replace


************************************************************Heterogeneity Analysis***************
cd "`datapath'"

** For this heterogeneity analysis, I will use JWDiD package. The susb-samples used will be 
*(1) Reasons of accidents due to highway user (2) Time (3) Weather (4) Visibility




********************************************************************************************
*  Table A10—: Heterogeneity Analysis of the Reasons of Accidents Due to Highway User's Action
********************************************************************************************

*Reasons of accidents due to highway user: 
*subsample(a) "Did not stop" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if highwayuseraction=="Did not stop"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


*********subsample(b) "Stopped on crossing" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if highwayuseraction=="Stopped on crossing"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple

********subsample(c) "Went around the gate" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if highwayuseraction=="Went around the gate"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


********subsample(d) "Stopped and then proceeded" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if highwayuseraction=="Stopped and then proceeded"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple
eststo clear
* Note: For the new staggered models, I reported the results in the .tex files manually. Please refer to the "Latex Source Files" subfolder of "Outcome" folder. 

*--------------------------------------------------------------------------
* END
*--------------------------------------------------------------------------



********************************************************************************************
* Table A11—: Heterogeneity Analysis of the Visibility Factors of Accidents
********************************************************************************************

 
*subsample(a) "Dark" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if  visibility=="Dark"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple

*subsample(b) "Day" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if  visibility=="Day"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple

eststo clear
* Note: For the new staggered models, I reported the results in the .tex files manually. Please refer to the "Latex Source Files" subfolder of "Outcome" folder. 

*--------------------------------------------------------------------------
* END
*--------------------------------------------------------------------------


********************************************************************************************
* Table A12—: Heterogeneity Analysis of the Timing of Accidents
********************************************************************************************


********subsample(a) "6 AM to 11:59 AM" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 
* Time variable 
gen hour_numeric = hour
* Adjust hour_numeric for AM/PM
replace hour_numeric = hour_numeric + 12 if ampm == "PM" & hour_numeric < 12
replace hour_numeric = hour_numeric - 12 if ampm == "AM" & hour_numeric == 12
gen dummy1 = (hour_numeric >= 6 & hour_numeric < 12)
keep if dummy1==1
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


********subsample(b) "12 PM to 5:59 PM" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 
* Time variable 
gen hour_numeric = hour
* Adjust hour_numeric for AM/PM
replace hour_numeric = hour_numeric + 12 if ampm == "PM" & hour_numeric < 12
replace hour_numeric = hour_numeric - 12 if ampm == "AM" & hour_numeric == 12
gen dummy2 = (hour_numeric >= 12 & hour_numeric < 18)
keep if dummy2==1
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple



********subsample(c) "6 PM to 11:59 PM" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 
* Time variable 
gen hour_numeric = hour
* Adjust hour_numeric for AM/PM
replace hour_numeric = hour_numeric + 12 if ampm == "PM" & hour_numeric < 12
replace hour_numeric = hour_numeric - 12 if ampm == "AM" & hour_numeric == 12
gen dummy3 = (hour_numeric >= 18 & hour_numeric < 24)
keep if dummy3==1
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


********subsample(d) "midnight to 5:59 AM" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 
* Time variable 
gen hour_numeric = hour
* Adjust hour_numeric for AM/PM
replace hour_numeric = hour_numeric + 12 if ampm == "PM" & hour_numeric < 12
replace hour_numeric = hour_numeric - 12 if ampm == "AM" & hour_numeric == 12
gen dummy4 = (hour_numeric >= 0 & hour_numeric < 6)
keep if dummy4==1
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple
eststo clear
* Note: For the new staggered models, I reported the results in the .tex files manually. Please refer to the "Latex Source Files" subfolder of "Outcome" folder. 



*--------------------------------------------------------------------------
* END
*--------------------------------------------------------------------------



********************************************************************************************
*  Table A13—: Heterogeneity Analysis of the Weather at the Time of Accidents
********************************************************************************************

********subsample(a) "Clear" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if weathercondition=="Clear"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


********subsample(b) "Cloudy" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if weathercondition=="Cloudy"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple



********subsample(c) "Rain" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 
keep if weathercondition=="Rain" 
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


********subsample(d) "Snow" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 
keep if weathercondition=="Snow"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple

eststo clear
* Note: For the new staggered models, I reported the results in the .tex files manually. Please refer to the "Latex Source Files" subfolder of "Outcome" folder. 

*--------------------------------------------------------------------------
* END
*--------------------------------------------------------------------------



********************************************************************************************
*  Table A14—: Heterogeneity Analysis of the Accidents Occurring at Public or Private Highway Crossings
********************************************************************************************


********subsample(a) "Public" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if publicprivate=="Public"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple


********subsample(b) "Private" 
clear all
use final 
// Calculate the count of accidents per railway crossing per month
egen accidents_per_year = total(accident) if !missing(year), by(crossing_id date_monthly)
gen whistle_year=year(WhistleDate)

* Crossing IDs are string.Let's change it:

egen crossingid = group(crossing_id)

duplicates report crossingid year
duplicates drop crossingid year, force 
 
* Let's define "gvar" first. gvar=0 if never treated and gvar = year first treated if ever treated.
gen first_treat= whistle_year
replace first_treat=0 if quietzone ==0 

keep if publicprivate=="Private"
* Keep only the important variable
keep accidents_per_year crossingid year first_treat
* Let's balance this data using tsfill
xtset crossingid year
tsfill, full
replace accidents_per_year=0 if accidents_per_year==.
* Let's replace first_treat
// Sort your dataset by id and year
sort crossingid year

// Create a new variable that checks if income is zero for any year for each id
egen nonmissing_other_year = min(first_treat), by(crossingid)

// Replace missing values of 'other_year' with non-missing values for the same id
replace first_treat = nonmissing_other_year if missing(first_treat)

// Drop the temporary variable nonmissing_other_year
drop nonmissing_other_year

xtset crossingid year

* Drop data before November, 1994
drop if year< 1995

****First I use jwdid package to implmemnt extended two-way Mundlak (ETWM) regression of "Wooldridge, Jeffrey. 2021.  Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Differences-in-Differences estimators. Working paper."
*Estimation of ATTGTs without controls using not-yet treated groups
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson)
estat simple

*Using Never treated as controls
jwdid accidents_per_year, ivar(crossingid) tvar(year) gvar(first_treat) method(poisson) never 
estat simple
eststo clear
* Note: For the new staggered models, I reported the results in the .tex files manually. Please refer to the "Latex Source Files" subfolder of "Outcome" folder. 

*--------------------------------------------------------------------------
* END
*--------------------------------------------------------------------------